108 research outputs found

    Optimal coding and the origins of Zipfian laws

    The problem of compression in standard information theory consists of assigning codes as short as possible to numbers. Here we consider the problem of optimal coding -- under an arbitrary coding scheme -- and show that it predicts Zipf's law of abbreviation, namely a tendency in natural languages for more frequent words to be shorter. We apply this result to investigate optimal coding also under so-called non-singular coding, a scheme where unique segmentation is not warranted but codes stand for a distinct number. Optimal non-singular coding predicts that the length of a word should grow approximately as the logarithm of its frequency rank, which is again consistent with Zipf's law of abbreviation. Optimal non-singular coding in combination with the maximum entropy principle also predicts Zipf's rank-frequency distribution. Furthermore, our findings on optimal non-singular coding challenge common beliefs about random typing. It turns out that random typing is in fact an optimal coding process, in stark contrast with the common assumption that it is detached from cost cutting considerations. Finally, we discuss the implications of optimal coding for the construction of a compact theory of Zipfian laws and other linguistic laws. Comment: in press in the Journal of Quantitative Linguistics; definition of concordant pair corrected, proofs polished, references updated
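The claim that word length grows roughly as the logarithm of frequency rank under non-singular coding can be illustrated with a minimal sketch. It assumes a hypothetical two-letter alphabet and simply hands out distinct strings, shortest first, to ranks 1, 2, 3, ...; the function names are illustrative and not taken from the paper.

```python
from itertools import count, product


def shortest_strings(alphabet):
    """Yield every non-empty string over `alphabet`, shortest first."""
    for length in count(1):
        for symbols in product(alphabet, repeat=length):
            yield "".join(symbols)


def nonsingular_code(n_types, alphabet="ab"):
    """Map frequency ranks 1..n_types to distinct strings, shortest first.

    Only distinctness is required (non-singular coding); the strings are
    not required to be uniquely segmentable when concatenated.
    """
    return dict(zip(range(1, n_types + 1), shortest_strings(alphabet)))


code = nonsingular_code(14)
lengths = [len(code[rank]) for rank in sorted(code)]
print(lengths)
```

With a two-letter alphabet there are 2 strings of length 1, 4 of length 2 and 8 of length 3, so the length increases by one each time the rank roughly doubles, which is the logarithmic growth the abstract describes.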

    Morphological complexity of languages reflects the settlement history of the Americas

    Morphological complexity is widely believed to increase with sociolinguistic isolation, and to decrease with language spreads and absorption of L2 adult learner populations. However, this can be assessed only for communities with well-described histories. Morphological complexity has also been shown to be greater in higher-altitude languages, which are often sociolinguistically isolated, so we use altitude as an empirically determinable proxy for sociolinguistics. In past research, only a very few small locations have been surveyed and the measures of complexity used were family-specific and not easily generalizable. We apply several improved measures of complexity and show that the correlation holds, especially in the Andean regions of South America. We discuss the implications of the South American pattern for the settlement of the Americas and post-settlement prehistoric population formation. Peer reviewed

    Zipf's law of abbreviation as a language universal

    Words that are used more frequently tend to be shorter. This statement is known as Zipf’s law of abbreviation. Here we perform the widest investigation of the presence of the law to date. In a sample of 1262 texts and 986 different languages (about 13% of the world’s language diversity), a negative correlation between word frequency and word length is found in all cases. In line with Zipf’s original proposal, we argue that this universal trend is likely to derive from fundamental principles of information processing and transfer.
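A negative correlation between word frequency and word length can be checked on any tokenized text. The sketch below is an illustration only: the toy sentence is invented, length is measured in characters, and the choice of Kendall's tau-a as the statistic is an assumption rather than something stated in the abstract.

```python
from collections import Counter
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Pairs tied on either variable count as neither.
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical toy text; a real study would use a full corpus per language.
text = "the cat and the dog and the bird saw the extraordinary caterpillar"
frequency = Counter(text.split())

words = list(frequency)
freqs = [frequency[w] for w in words]
lengths = [len(w) for w in words]  # word length in characters

tau = kendall_tau_a(freqs, lengths)
print(tau)
```

A value of `tau` below zero means that, among word pairs that differ in both frequency and length, the more frequent word is more often the shorter one.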

    The optimality of word lengths. Theoretical foundations and an empirical study

    Zipf's law of abbreviation, namely the tendency of more frequent words to be shorter, has been viewed as a manifestation of compression, i.e. the minimization of the length of forms -- a universal principle of natural communication. Although the claim that languages are optimized has become trendy, attempts to measure the degree of optimization of languages have been rather scarce. Here we present two optimality scores that are dually normalized, namely, they are normalized with respect to both the minimum and the random baseline. We analyze the theoretical and statistical pros and cons of these and other scores. Harnessing the best score, we quantify for the first time the degree of optimality of word lengths in languages. This indicates that languages are optimized to 62 or 67 percent on average (depending on the source) when word lengths are measured in characters, and to 65 percent on average when word lengths are measured in time. In general, spoken word durations are more optimized than written word lengths in characters. Our work paves the way to measure the degree of optimality of the vocalizations or gestures of other species, and to compare them against written, spoken, or signed human languages. Comment: On the one hand, the article has been reduced: analyses of the law of abbreviation and some of the methods have been moved to another article; appendix B has been reduced. On the other hand, various parts have been rewritten for clarity; new figures have been added to ease the understanding of the scores; new citations added. Many typos have been corrected
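The idea of normalizing against both the minimum and the random baseline can be sketched as follows. The abstract does not give the formula, so the form used here, (L_random - L) / (L_random - L_min) with L the probability-weighted mean word length, is an assumption made for illustration, and all names are hypothetical.

```python
def mean_length(probs, lengths):
    """Probability-weighted mean word length."""
    return sum(p * l for p, l in zip(probs, lengths))


def dual_normalized_score(probs, lengths):
    """Assumed score: 1 at the minimum baseline, 0 at the random baseline.

    `probs` must sum to 1. The score is undefined (division by zero) when
    every word has the same length or every word has the same probability.
    """
    observed = mean_length(probs, lengths)
    # Minimum baseline: reassign the same lengths so that the most
    # probable word receives the shortest length, and so on.
    minimum = mean_length(sorted(probs, reverse=True), sorted(lengths))
    # Random baseline: the expected mean length when lengths are shuffled
    # uniformly at random, which equals the unweighted mean length.
    random_baseline = sum(lengths) / len(lengths)
    return (random_baseline - observed) / (random_baseline - minimum)


probs = [0.5, 0.3, 0.2]
print(dual_normalized_score(probs, [1, 2, 3]))  # shortest word is most frequent
print(dual_normalized_score(probs, [2, 1, 3]))  # partially optimized
print(dual_normalized_score(probs, [3, 2, 1]))  # longest word is most frequent
```

Under this assumed form, a score near 1 indicates word lengths close to the best possible assignment, a score near 0 indicates lengths no better than a random assignment, and negative values indicate an assignment worse than random.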

    OCT-4 expression in follicular and luteal phase endometrium: a pilot study

    Background: The stem cell marker Octamer-4 (OCT-4) is expressed in human endometrium. Menstrual cycle-dependency of OCT-4 expression has not been investigated to date. Methods: In a prospective, single center cohort study of 89 women undergoing hysteroscopy during the follicular (n = 49) and the luteal (n = 40) phases of the menstrual cycle, we obtained endometrial samples. Specimens were investigated for OCT-4 expression on the mRNA and protein levels using reverse transcriptase polymerase chain reaction (RT-PCR) and immunohistochemistry. Expression of OCT-4 was correlated to menstrual cycle phase. Results: Of 89 women sampled, 49 were in the follicular phase and 40 were in the luteal phase. OCT-4 mRNA was detected in all samples. Increased OCT-4 mRNA levels in the follicular and luteal phases were found in 35/49 (71%) and 27/40 (68%) of women, respectively (p = 0.9). Increased expression of OCT-4 protein was identified in 56/89 (63%) samples. Increased expression of OCT-4 protein in the follicular and luteal phases was found in 33/49 (67%) and 23/40 (58%) of women, respectively (p = 0.5). Conclusions: On the mRNA and protein levels, OCT-4 is not differentially expressed during the menstrual cycle. Endometrial OCT-4 is not involved in or modulated by hormone-induced cyclical changes of the endometrium.

    Adaptive Communication: Languages with More Non-Native Speakers Tend to Have Fewer Word Forms.

    Explaining the diversity of languages across the world is one of the central aims of typological, historical, and evolutionary linguistics. We consider the effect of language contact (the number of non-native speakers a language has) on the way languages change and evolve. By analysing hundreds of languages within and across language families, regions, and text types, we show that languages with greater levels of contact typically employ fewer word forms to encode the same information content (a property we refer to as lexical diversity). Based on three types of statistical analyses, we demonstrate that this variance can in part be explained by the impact of non-native speakers on information encoding strategies. Finally, we argue that languages are information encoding systems shaped by the varying needs of their speakers. Language evolution and change should be modeled as the co-evolution of multiple intertwined adaptive systems: on one hand, the structure of human societies and human learning capabilities, and on the other, the structure of language. CB is funded by an Arts and Humanities Research Council (UK) doctoral grant (reference number: 04325), a grant from the Cambridge Home and European Scholarship Scheme, and by Cambridge English, University of Cambridge. AV is supported by ERC grant 'The evolution of human languages' (reference number: 268744). DK is supported by EPSRC grant EP/I037512/1. FH is funded by a Benefactor's Scholarship of St. John's College, Cambridge. PB is supported by Cambridge English, University of Cambridge. This is the final version. It first appeared at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0128254

    Precise Black Hole Masses From Megamaser Disks: Black Hole-Bulge Relations at Low Mass

    The black hole (BH)-bulge correlations have greatly influenced the last decade of effort to understand galaxy evolution. Current knowledge of these correlations is limited predominantly to high BH masses (M_BH > 10^8 M_sun) that can be measured using direct stellar, gas, and maser kinematics. These objects, however, do not represent the demographics of more typical L < L* galaxies. This study transcends prior limitations to probe BHs that are an order of magnitude lower in mass, using BH mass measurements derived from the dynamics of H_2O megamasers in circumnuclear disks. The masers trace the Keplerian rotation of circumnuclear molecular disks starting at radii of a few tenths of a pc from the central BH. Modeling of the rotation curves, presented by Kuo et al. (2010), yields BH masses with exquisite precision. We present stellar velocity dispersion measurements for a sample of nine megamaser disk galaxies based on long-slit observations using the B&C spectrograph on the Dupont telescope and the DIS spectrograph on the 3.5m telescope at Apache Point. We also perform bulge-to-disk decomposition of a subset of five of these galaxies with SDSS imaging. The maser galaxies as a group fall below the M_BH-sigma* relation defined by elliptical galaxies. We show, now with very precise BH mass measurements, that the low-scatter power-law relation between M_BH and sigma* seen in elliptical galaxies is not universal. The elliptical galaxy M_BH-sigma* relation cannot be used to derive the BH mass function at low mass or the zeropoint for active BH masses. The processes (perhaps BH self-regulation or minor merging) that operate at higher mass have not effectively established an M_BH-sigma* relation in this low-mass regime. Comment: 21 pages, 14 figures, accepted for publication in the Astrophysical Journal

    The characteristic blue spectra of accretion disks in quasars as uncovered in the infrared

    Quasars are thought to be powered by supermassive black holes accreting surrounding gas. Central to this picture is a putative accretion disk which is believed to be the source of the majority of the radiative output. It is well known, however, that the most extensively studied disk model -- an optically thick disk which is heated locally by the dissipation of gravitational binding energy -- is apparently contradicted by observations in a few major respects. In particular, the model predicts a specific blue spectral shape asymptotically from the visible to the near-infrared, but this is not generally seen in the visible wavelength region where the disk spectrum is observable. A crucial difficulty was that, toward the infrared, the disk spectrum starts to be hidden under strong hot dust emission from much larger but hitherto unresolved scales, and thus has essentially been impossible to observe. Here we report observations of polarized light interior to the dust-emitting region that enable us to uncover this near-infrared disk spectrum in several quasars. The revealed spectra show that the near-infrared disk spectrum is indeed as blue as predicted. This indicates that, at least for the outer near-infrared-emitting radii, the standard picture of the locally heated disk is approximately correct. The model problems at shorter wavelengths should then be directed toward a better understanding of the inner parts of the revealed disk. The newly uncovered disk emission at large radii, with more future measurements, will also shed totally new light on the unanswered critical question of how and where the disk ends. Comment: published in Nature, 24 July 2008 issue. Supplementary Information can be found at http://www.mpifr-bonn.mpg.de/div/ir-interferometry/suppl_info.pdf Published version can be accessed from http://www.nature.com/nature/journal/v454/n7203/pdf/nature07114.pd

    AMUSE-Virgo II. Down-sizing in black hole accretion

    (Abridged) We complete the census of nuclear X-ray activity in 100 early type Virgo galaxies observed by the Chandra X-ray Telescope as part of the AMUSE-Virgo survey, down to a (3sigma) limiting luminosity of 3.7E+38 erg/s over 0.5-7 keV. The stellar mass distribution of the targeted sample, which is mostly composed of formally `inactive' galaxies, peaks below 1E+10 M_Sun, a regime where the very existence of nuclear super-massive black holes (SMBHs) is debated. Out of 100 objects, 32 show a nuclear X-ray source, including 6 hybrid nuclei which also host a massive nuclear cluster as visible from archival HST images. After carefully accounting for contamination from nuclear low-mass X-ray binaries based on the shape and normalization of their X-ray luminosity function, we conclude that between 24% and 34% of the galaxies in our sample host an X-ray active SMBH (at the 95% C.L.). This sets a firm lower limit to the black hole occupation fraction in nearby bulges within a cluster environment. At face value, the active fraction (down to our luminosity limit) is found to increase with host stellar mass. However, taking into account selection effects, we find that the average Eddington-scaled X-ray luminosity scales with black hole mass as M_BH^(-0.62^{+0.13}_{-0.12}), with an intrinsic scatter of 0.46^{+0.08}_{-0.06} dex. This finding can be interpreted as observational evidence for `down-sizing' of black hole accretion in local early types, that is, low mass black holes shine relatively closer to their Eddington limit than higher mass objects. As a consequence, the fraction of active galaxies, defined as those above a fixed X-ray Eddington ratio, decreases with increasing black hole mass. Comment: Accepted for publication in ApJ (no changes wrt v1)

    Mental health impact among hospital staff in the aftermath of the Nice 2016 terror attack: the ECHOS de Nice study

    BACKGROUND: The Nice terror attack of July 14, 2016 resulted in 84 deaths and 434 injured, with many hospital staff exposed to the attack, either as bystanders on site at the time of the attack ('bystander exposure') who may or may not have provided care to attack victims subsequently, or as care providers to victims only ('professional exposure only'). The objective of this study is to describe the impact on mental health among hospital staff by category of exposure with a particular focus on those with 'professional exposure only', and to assess their use of psychological support resources. METHOD: An observational, cross-sectional, multicenter study was conducted from 06/20/2017 to 10/31/2017 among all staff of two healthcare institutions in Nice, using a web questionnaire. Collected data included social, demographic and professional characteristics; trauma exposure category ('bystanders to the attack'; 'professional exposure only'; 'unexposed'); indicators of psychological impact (Hospital Anxiety and Depression Scale); PTSD (PCL-5) level; support sought. Responders could enter open comments in each section of the questionnaire, which were processed by inductive analysis. RESULTS: 804 staff members' questionnaires were analysed. Among responding staff, 488 were exposed (61%): 203 were 'bystanders to the attack', 285 had 'professional exposure only'. The staff with 'professional exposure only' reported anxiety (13.2%), depression (4.6%), suicidal thoughts (5.5%); the rate of full PTSD was 9.4% and of partial PTSD, 17.7%. Multivariate analysis in the 'professional exposure only' category showed that the following characteristics were associated with full or partial PTSD: female gender (OR = 2.79; 95% CI = 1.19-6.56, p = 0.019); social isolation (OR = 3.80; 95% CI = 1.30-11.16, p = 0.015); having been confronted with an unfamiliar task (OR = 3.04; 95% CI = 1.18-7.85; p = 0.022). Lastly, 70.6% of the staff with 'professional exposure only' with full PTSD did not seek psychological support. CONCLUSION: Despite a significant impact on mental health, few staff with 'professional exposure only' sought psychological support. Robust prevention and follow-up programs must be developed for hospital staff, in order to manage the health hazards they face when exposed to exceptional health-related events such as mass terror attacks. STUDY REGISTRATION: Ethical approval for the trial was obtained from the National Ethics Committee for Human Research (RCBID N° 2017-A00812-51).